35 research outputs found

    Quantifying & characterizing information diets of social media users

    Get PDF
    An increasing number of people rely on online social media platforms like Twitter and Facebook to consume news and information about the world around them. This change has led to a paradigm shift in the way news and information are exchanged in our society – from traditional mass media to online social media. Given this changing environment, it is essential to study the information consumption of social media users and to audit how automated algorithms (like search and recommendation systems) modify the information that social media users consume. In this thesis, we fulfill this high-level goal with a two-fold approach. First, we propose the concept of information diets as the composition of information produced or consumed. Next, we quantify the diversity and bias in the information diets that social media users consume via the three main consumption channels on social media platforms: (a) word-of-mouth channels that users curate for themselves by creating social links, (b) recommendations that platform providers give to the users, and (c) search systems that users use to find interesting information on these platforms. We measure the information diets of social media users along three dimensions: topics, geographic sources, and political perspectives. Our work is aimed at making social media users aware of the potential biases in their consumed diets, and at encouraging the development of novel mechanisms for mitigating the effects of these biases.

    Do Neural Ranking Models Intensify Gender Bias?

    Full text link
    Concerns regarding the footprint of societal biases in information retrieval (IR) systems have been raised in several previous studies. In this work, we examine various recent IR models from the perspective of the degree of gender bias in their retrieval results. To this end, we first provide a bias measurement framework that includes two metrics to quantify the degree of the unbalanced presence of gender-related concepts in a given IR model's ranking list. To examine IR models by means of this framework, we create a dataset of non-gendered queries, selected by human annotators. Applying these queries to the MS MARCO Passage retrieval collection, we then measure the gender bias of a BM25 model and several recent neural ranking models. The results show that while all models are strongly biased toward male concepts, the neural models, and in particular the ones based on contextualized embedding models, significantly intensify gender bias. Our experiments also show an overall increase in the gender bias of neural models when they exploit transfer learning, namely when they use (already biased) pre-trained embeddings. Comment: In Proceedings of ACM SIGIR 2020.
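
    The abstract describes two metrics for quantifying the unbalanced presence of gender-related concepts in a ranking list. The sketch below illustrates the general idea with a simple rank-discounted term-count imbalance score; the word lists, the discounting scheme, and the function names are illustrative assumptions, not the paper's exact metrics.

```python
# Minimal sketch (an assumption, not the paper's exact metric) of a rank-based
# gender imbalance score for a ranked list of retrieved passages.
import re

MALE_TERMS = {"he", "him", "his", "man", "men", "male"}        # illustrative word lists
FEMALE_TERMS = {"she", "her", "hers", "woman", "women", "female"}

def term_count(text, vocabulary):
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for t in tokens if t in vocabulary)

def gender_imbalance(ranked_passages, k=10):
    """Positive values indicate more male- than female-related terms in the top-k."""
    male = female = 0.0
    for rank, passage in enumerate(ranked_passages[:k], start=1):
        weight = 1.0 / rank                      # discount lower-ranked passages
        male += weight * term_count(passage, MALE_TERMS)
        female += weight * term_count(passage, FEMALE_TERMS)
    total = male + female
    return 0.0 if total == 0 else (male - female) / total

# Example: a model's top results for a non-gendered query
print(gender_imbalance(["He said the doctor examined him.", "The woman left early."]))
```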

    Characterizing Information Diets of Social Media Users

    Full text link
    With the widespread adoption of social media sites like Twitter and Facebook, there has been a shift in the way information is produced and consumed. Earlier, the only producers of information were traditional news organizations, which broadcast the same carefully edited information to all consumers over mass media channels. Now, in online social media, any user can be a producer of information, and every user selects which other users she connects to, thereby choosing the information she consumes. Moreover, the personalized recommendations that most social media sites provide also contribute to the information consumed by individual users. In this work, we define the concept of an information diet -- the topical distribution of a given set of information items (e.g., tweets) -- to characterize the information produced and consumed by various types of users on the popular Twitter social media platform. At a high level, we find that (i) popular users mostly produce very specialized diets focusing on only a few topics; in fact, news organizations (e.g., NYTimes) produce much more focused diets on social media than in their mass media output, (ii) most users' consumption diets are primarily focused on one or two topics of their interest, and (iii) the personalized recommendations provided by Twitter help mitigate some of the topical imbalances in users' consumption diets by adding information on diverse topics beyond the users' primary topics of interest. Comment: In Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), Oxford, UK, May 2015.
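
    The abstract defines an information diet as the topical distribution of a set of information items. A minimal sketch of that definition, assuming topic labels for the items are already available (the thesis's own topic-inference step is not reproduced here):

```python
# Minimal sketch of an "information diet": the normalized topical distribution
# of a set of information items. Topic labels are assumed to be given.
from collections import Counter

def information_diet(topic_labels):
    counts = Counter(topic_labels)
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}

# Example: topics of tweets consumed by one user
consumed = ["politics", "politics", "sports", "politics", "technology"]
print(information_diet(consumed))   # {'politics': 0.6, 'sports': 0.2, 'technology': 0.2}
```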

    Where the earth is flat and 9/11 is an inside job: A comparative algorithm audit of conspiratorial information in web search results

    Get PDF
    Web search engines are important online information intermediaries that are frequently used and highly trusted by the public, despite multiple pieces of evidence that their outputs are subject to inaccuracies and biases. One form of such inaccuracy, which has so far received little scholarly attention, is the presence of conspiratorial information, namely pages promoting conspiracy theories. We address this gap by conducting a comparative algorithm audit to examine the distribution of conspiratorial information in search results across five search engines: Google, Bing, DuckDuckGo, Yahoo and Yandex. Using a virtual agent-based infrastructure, we systematically collect search outputs for six conspiracy theory-related queries (“flat earth”, “new world order”, “qanon”, “9/11”, “illuminati”, “george soros”) across three locations (two in the US and one in the UK) and two waves (March and May 2021). We find that all search engines except Google consistently displayed conspiracy-promoting results and returned links to conspiracy-dedicated websites, with variations across queries. Most conspiracy-promoting results came from social media and conspiracy-dedicated websites, while conspiracy-debunking information was shared by scientific websites and legacy media. These observations are consistent across locations and time periods, highlighting the possibility that some engines systematically prioritize conspiracy-promoting content.

    Novelty in news search: a longitudinal study of the 2020 US elections

    Full text link
    News coverage of the 2020 US elections was extensive, with new pieces of information generated rapidly. This evolving scenario presented an opportunity to study the performance of search engines in a context in which they had to quickly process information as it was published. We analyze novelty, a measurement of the new items that emerge in the top news search results, to compare the coverage and visibility of different topics. We conduct a longitudinal study of the news results of five search engines, collected in short bursts (every 21 minutes) from two regions (Oregon, US and Frankfurt, Germany), starting on election day and lasting until one day after the announcement of Biden as the winner. We find more new items emerging for election-related queries ("joe biden", "donald trump" and "us elections") than for topical (e.g., "coronavirus") or stable (e.g., "holocaust") queries. We demonstrate differences across search engines and regions over time, and we highlight imbalances between candidate queries. When it comes to news search, search engines are responsible for such imbalances, either due to their algorithms or the set of news sources they rely on. We argue that such imbalances affect the visibility of political candidates in news searches during electoral periods.
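
    Novelty is described here as a measurement of the new items that emerge in the top news search results. Below is a minimal sketch under the assumption that novelty is operationalized as the number of previously unseen URLs in each top-k snapshot of a query's results:

```python
# Minimal sketch (an assumed operationalization) of novelty: the number of result
# URLs in the current top-k snapshot that did not appear in any earlier snapshot
# for the same query.
def novelty(snapshots, k=10):
    """snapshots: list of ranked URL lists, one per collection round."""
    seen = set()
    scores = []
    for results in snapshots:
        top_k = results[:k]
        new_items = [url for url in top_k if url not in seen]
        scores.append(len(new_items))
        seen.update(top_k)
    return scores

# Example: three snapshots of news results for one query, 21 minutes apart
print(novelty([["a", "b", "c"], ["a", "d", "c"], ["e", "d", "a"]]))  # [3, 1, 1]
```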

    What do Twitter comments tell about news article bias? Assessing the impact of news article bias on its perception on Twitter

    Full text link
    News stories circulating online, especially on social media platforms, are nowadays a primary source of information. Given the nature of social media, news items are no longer just news; they are embedded in the conversations of the users interacting with them. This is particularly relevant for inaccurate information or even outright misinformation, because user interaction has a crucial impact on whether information is uncritically disseminated or not. Biased coverage has been shown to affect personal decision-making. Still, it remains an open question whether users are aware of the biased reporting they encounter and how they react to it. The latter is particularly relevant given that user reactions help contextualize reporting for other users and can thus help mitigate, but may also exacerbate, the impact of biased media coverage. This paper approaches the question from a measurement point of view, examining whether reactions to news articles on Twitter can serve as bias indicators, i.e., whether the way users comment on a given article relates to its actual level of bias. We first give an overview of research on media bias before discussing key concepts related to how individuals engage with online content, focusing on the sentiment (or valence) of comments and on outright hate speech. We then present the first dataset connecting reliable human-made media bias classifications of news articles with the reactions these articles received on Twitter. We call our dataset BAT - Bias And Twitter. BAT covers 2,800 (bias-rated) news articles from 255 English-speaking news outlets. Additionally, BAT includes 175,807 comments and retweets referring to the articles. Based on BAT, we conduct a multi-feature analysis to identify comment characteristics and analyze whether Twitter reactions correlate with an article's bias. First, we fine-tune and apply two XLNet-based classifiers for hate speech detection and sentiment analysis. Second, we relate the results of the classifiers to the article bias annotations within a multi-level regression. The results show that Twitter reactions to an article indicate its bias, and vice versa. With a regression coefficient of 0.703, we specifically present evidence that Twitter reactions to biased articles are significantly more hateful. Our analysis shows that a news outlet's individual stance reinforces the hate-bias relationship. In future work, we will extend the dataset and analysis, including additional concepts related to media bias.
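
    The analysis relates comment-level classifier outputs to article-level bias annotations in a multi-level regression. The sketch below shows one way such a model could be set up with a random intercept per news outlet; the column names, toy data, and exact model specification are assumptions for illustration, not the paper's actual setup.

```python
# Minimal sketch of a multi-level regression relating comment hatefulness to
# article bias, with a random intercept per news outlet. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
outlets = rng.choice(["A", "B", "C", "D"], size=n)
outlet_effect = {"A": 0.0, "B": 0.1, "C": -0.1, "D": 0.05}
article_bias = rng.uniform(0, 1, size=n)
hate_score = (0.5 * article_bias
              + np.array([outlet_effect[o] for o in outlets])
              + rng.normal(0, 0.1, size=n))

df = pd.DataFrame({"hate_score": hate_score,
                   "article_bias": article_bias,
                   "outlet": outlets})

# Fixed effect of article bias on comment hatefulness; random intercept per outlet
model = smf.mixedlm("hate_score ~ article_bias", data=df, groups=df["outlet"])
result = model.fit()
print(result.summary())
```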

    Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data

    Full text link
    Understanding human activities and movements on the Web is not only important for computational social scientists but can also offer valuable guidance for the design of online systems for recommendations, caching, advertising, and personalization. In this work, we demonstrate that people tend to follow routines on the Web, and these repetitive patterns of web visits increase the achievable predictability of their browsing behavior. We present an information-theoretic framework for measuring the uncertainty and theoretical limits of predictability of human mobility on the Web. We systematically assess the impact of different design decisions on the measurement. We apply the framework to a web tracking dataset of German internet users. Our empirical results highlight that individuals' routines on the Web make their browsing behavior 85% predictable on average, though the value varies across individuals. We observe that these differences in users' predictability can be explained to some extent by their demographic and behavioral attributes. Comment: 12 pages, 8 figures. To be published in the proceedings of the International AAAI Conference on Web and Social Media (ICWSM) 2021.
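
    The abstract refers to an information-theoretic framework for measuring uncertainty and the theoretical limit of predictability of browsing. A common way to compute such a limit (shown here as an illustration of the general approach, not the paper's exact entropy estimator or design choices) is to estimate the entropy of a user's visit sequence and solve Fano's inequality for the maximum achievable prediction accuracy:

```python
# Minimal sketch of the standard predictability-limit calculation: estimate the
# entropy of a visit sequence and solve Fano's inequality for the maximum
# achievable prediction accuracy Pi_max. Plain Shannon entropy is used here as a
# simplification; the paper's own estimator is not reproduced.
import math
from collections import Counter

def shannon_entropy(sequence):
    counts = Counter(sequence)
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def max_predictability(entropy, n_states, tol=1e-6):
    """Solve S = H(p) + (1 - p) * log2(n_states - 1) for p (Fano's inequality),
    taking the root on the decreasing branch p >= 1/n_states via bisection."""
    if n_states < 2 or entropy <= 0:
        return 1.0
    def fano(p):
        h = -p * math.log2(p) - (1 - p) * math.log2(1 - p) if 0 < p < 1 else 0.0
        return h + (1 - p) * math.log2(n_states - 1) - entropy
    lo, hi = 1.0 / n_states, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Example: a short, hypothetical sequence of website visits
visits = ["news.de", "mail.com", "news.de", "shop.de", "news.de", "mail.com"]
s = shannon_entropy(visits)
print(max_predictability(s, n_states=len(set(visits))))
```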

    Who are the users of ChatGPT? Implications for the digital divide from web tracking data

    Full text link
    A major challenge of our time is reducing disparities in access to and effective use of digital technologies, with recent discussions highlighting the role of AI in exacerbating the digital divide. We examine the user characteristics that predict usage of the AI-powered conversational agent ChatGPT. We combine web tracking and survey data from N=1068 German citizens to investigate differences in activity (usage, visits, and duration on chat.openai.com). We examine socio-demographics commonly associated with the digital divide and explore further socio-political attributes identified via stability selection in Lasso regressions. We confirm that lower age and higher education are associated with ChatGPT usage, whereas gender and income are not. We find that full-time employment and having more children are barriers to ChatGPT activity. Rural residence, writing and social media activities, as well as greater political knowledge, were positively associated with ChatGPT activity. Our research informs efforts to address digital disparities and promote digital literacy among underserved populations.
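
    The abstract mentions attributes identified via stability selection in Lasso regressions. A minimal sketch of that procedure: repeatedly fit an L1-penalized regression on random subsamples and keep the predictors that receive non-zero coefficients in a large share of the fits. The outcome, feature set, and hyperparameters below are illustrative assumptions, not the study's actual data or settings.

```python
# Minimal sketch of stability selection with Lasso regressions on synthetic data.
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, alpha=0.1, n_runs=100, sample_frac=0.5,
                        threshold=0.7, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    selected = np.zeros(p)
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(sample_frac * n), replace=False)
        model = Lasso(alpha=alpha).fit(X[idx], y[idx])
        selected += (model.coef_ != 0)          # count non-zero coefficients
    frequency = selected / n_runs
    return frequency >= threshold, frequency

# Example with synthetic data: only the first two predictors matter
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))                   # e.g., age, education, employment, ...
y = 2 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=500)
stable, freq = stability_selection(X, y)
print(stable, freq.round(2))
```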

    Gender, Age, and Technology Education Influence the Adoption and Appropriation of LLMs

    Full text link
    Large Language Models (LLMs) such as ChatGPT have become increasingly integrated into critical activities of daily life, raising concerns about equitable access and utilization across diverse demographics. This study investigates the usage of LLMs among 1,500 representative US citizens. Remarkably, 42% of participants reported utilizing an LLM. Our findings reveal a gender gap in LLM technology adoption (more male users than female users) with complex interaction patterns regarding age. Technology-related education eliminates the gender gap in our sample. Moreover, expert users are more likely than novices to list professional tasks as typical application scenarios, suggesting discrepancies in effective usage in the workplace. These results underscore the importance of providing education in artificial intelligence in our technology-driven society to promote equitable access to and benefits from LLMs. We call for both international replication beyond the US and longitudinal observation of adoption.

    Misinformation, Believability, and Vaccine Acceptance Over 40 Countries: Takeaways From the Initial Phase of The COVID-19 Infodemic

    Full text link
    The COVID-19 pandemic has been damaging to the lives of people all around the world. Accompanying the pandemic is an infodemic: an abundant and uncontrolled spread of potentially harmful misinformation. The infodemic may severely change the pandemic's course by interfering with public health interventions such as wearing masks, social distancing, and vaccination. In particular, the impact of the infodemic on vaccination is critical because vaccination holds the key to reverting to pre-pandemic normalcy. This paper presents findings from a global survey on the extent of worldwide exposure to the COVID-19 infodemic, assesses different populations' susceptibility to false claims, and analyzes its association with vaccine acceptance. Based on responses gathered from over 18,400 individuals from 40 countries, we find a strong association between the perceived believability of misinformation and vaccination hesitancy. Additionally, our study shows that only half of the online users exposed to rumors might have seen the corresponding fact-checked information. Moreover, depending on the country, between 6% and 37% of individuals considered these rumors believable. Our survey also shows that poorer regions are more susceptible to encountering and believing COVID-19 misinformation. We discuss the implications of our findings for public campaigns that proactively spread accurate information to countries that are more susceptible to the infodemic. We also highlight fact-checking platforms' role in better identifying and prioritizing claims that are perceived to be believable and have wide exposure. Our findings give insights into better handling of risk communication during the initial phase of a future pandemic.